Approximate measurement invariance in cross-classified rater-mediated assessments

Authors

  • Ben Kelcey
  • Dan McGinn
  • Heather Hill
Abstract

An important assumption underlying meaningful comparisons of scores in rater-mediated assessments is that measurement is commensurate across raters. When raters differentially apply the standards established by an instrument, scores from different raters are on fundamentally different scales and no longer preserve a common meaning and basis for comparison. In this study, we developed a method to accommodate measurement noninvariance across raters when measurements are cross-classified within two distinct hierarchical units. We conceptualized random item effects cross-classified graded response models and used random discrimination and threshold effects to test, calibrate, and account for measurement noninvariance among raters. By leveraging empirical estimates of rater-specific deviations in the discrimination and threshold parameters, the proposed method allows us to identify noninvariant items and empirically estimate and directly adjust for this noninvariance within a cross-classified framework. Within the context of teaching evaluations, the results of a case study suggested substantial noninvariance across raters and that establishing an approximately invariant scale through random item effects improves model fit and predictive validity.
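
The idea of rater-specific deviations in discrimination and threshold parameters can be illustrated with a small sketch. This is not the authors' model or code: it assumes a standard logistic graded response parameterization, and the function name `grm_category_probs` and the arguments `rater_da` and `rater_db` are hypothetical names introduced here for illustration. It only shows how a rater's shift in thresholds or discrimination changes the category probabilities for the same underlying trait level.

```python
import numpy as np


def grm_category_probs(theta, a, b, rater_da=0.0, rater_db=0.0):
    """Category probabilities under a logistic graded response model.

    theta    : latent trait of the rated unit (e.g., a teacher's lesson quality)
    a        : item discrimination shared across raters
    b        : increasing item thresholds (length K - 1 for K categories)
    rater_da : hypothetical rater-specific deviation in discrimination
    rater_db : hypothetical rater-specific deviation added to every threshold
    """
    a_r = a + rater_da                           # rater-adjusted discrimination
    b_r = np.asarray(b, dtype=float) + rater_db  # rater-adjusted thresholds

    # Cumulative probabilities P(Y >= k) for k = 1, ..., K - 1
    cum = 1.0 / (1.0 + np.exp(-a_r * (theta - b_r)))

    # Category probabilities are differences of adjacent cumulative curves,
    # with P(Y >= 0) = 1 and P(Y >= K) = 0 at the boundaries.
    upper = np.concatenate(([1.0], cum))
    lower = np.concatenate((cum, [0.0]))
    return upper - lower


# Same item and same trait level, scored by a reference rater and by a
# more severe rater whose thresholds are shifted upward (assumed values).
reference = grm_category_probs(theta=0.0, a=1.2, b=[-1.0, 0.0, 1.0])
severe = grm_category_probs(theta=0.0, a=1.2, b=[-1.0, 0.0, 1.0], rater_db=0.5)
print(reference.round(3))
print(severe.round(3))
```

Under these assumed values, the severe rater assigns less probability to the top category and more to the bottom one for an identical trait level, which is the kind of scale mismatch that random item effects are meant to capture and adjust for.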

Similar resources

Comparing results of an exact vs. an approximate (Bayesian) measurement invariance test: a cross-country illustration with a scale to measure 19 human values

One of the most frequently used procedures for measurement invariance testing is the multigroup confirmatory factor analysis (MGCFA). Muthén and Asparouhov recently proposed a new approach to test for approximate rather than exact measurement invariance using Bayesian MGCFA. Approximate measurement invariance permits small differences between parameters otherwise constrained to be equal in the ...

Impact of Not Addressing Partially Cross-Classified Multilevel Structure in Testing Measurement Invariance: A Monte Carlo Study

In educational settings, researchers are likely to encounter multilevel data with cross-classified structure. However, due to the lack of familiarity and limitations of statistical software for cross-classified modeling, most researchers adopt less optimal approaches to analyze cross-classified multilevel data in testing measurement invariance. We conducted two Monte Carlo studies to investigat...

Cross-Cultural Invariance of the Mental Toughness Inventory among Australian, Chinese, and Malaysian Athletes: A Bayesian Estimation Approach

The aims of this study were to assess the cross-cultural invariance of athletes’ self-reports of mental toughness, and introduce and illustrate the application of approximate measurement invariance using Bayesian estimation for sport and exercise psychology scholars. Athletes from Australia (n = 353, Mage = 19.13, SD = 3.27, males = 161), China (n = 254, Mage = 17.82, SD = 2.28, males =...

Sequential Effects in Essay Ratings: Evidence of Assimilation Effects Using Cross-Classified Models

Writing assessments are an indispensable part of most language competency tests. In our research, we used cross-classified models to study rater effects in the real essay rating process of a large-scale, high-stakes educational examination administered in China in 2011. Generally, four cross-classified models are suggested for investigation of rater effects: (1) the existence of sequential effe...

Item response theory: applications of modern test theory in medical education.

CONTEXT: Item response theory (IRT) measurement models are discussed in the context of their potential usefulness in various medical education settings such as assessment of achievement and evaluation of clinical performance. PURPOSE: The purpose of this article is to compare and contrast IRT measurement with the more familiar classical measurement theory (CMT) and to explore the benefits of IR...

Journal title:

Volume 5, Issue

Pages -

Publication date: 2014